From policies to influences: a framework for nonlocal abstraction in transition-dependent Dec-POMDP agents

Authors

  • Stefan J. Witwicki
  • Edmund H. Durfee
Abstract

Decentralized Partially-Observable Markov Decision Processes (Dec-POMDPs) are powerful theoretical models for deriving optimal coordination policies of agent teams in environments with uncertainty. Unfortunately, their general NEXP solution complexity [3] presents significant challenges when applying them to real-world problems, particularly those involving teams of more than two agents. Inevitably, the policy space becomes intractably large as agents coordinate joint decisions that are based on dissimilar beliefs about an uncertain world state and that involve performing actions with stochastic effects. Our work directly confronts the policy-space explosion with the intuition that, instead of coordinating all policy decisions, agents need only coordinate abstractions of their policies that constitute the essential influences they exert on each other. As a running example, consider the problem shown in Figure 1, involving two interacting rover agents (among a team of several others) exploring the surface of Mars. As shown, the agents perform various tasks (constrained to take place within a window of execution) with nondeterministic duration (D) and quality (Q) outcomes, and in performing their tasks may alter the outcomes of other agents' tasks. Here, agent 1 may choose to visit and prepare research site C, which will (in expectation) make agent 2's analysis of site C quicker and more valuable. This problem can be expressed as a transition-dependent Dec-POMDP.
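To make the running example concrete, here is a minimal Python sketch of the task model it describes. The specific window bounds, duration/quality distributions, and names such as Task, prepare_c, and analyze_c are illustrative assumptions, not values from the paper.

```python
"""Minimal sketch of the two-rover running example (Figure 1).
All numbers and names below are made up for illustration."""
import random
from dataclasses import dataclass

@dataclass
class Task:
    name: str
    window: tuple    # (earliest start, latest finish) in time steps
    durations: dict  # duration -> probability
    qualities: dict  # quality -> probability

def sample(dist):
    """Draw an outcome from a {value: probability} distribution."""
    r, acc = random.random(), 0.0
    for value, p in dist.items():
        acc += p
        if r < acc:
            return value
    return value  # guard against floating-point rounding

# Agent 1 may prepare research site C within its execution window.
prepare_c = Task("prepare C", window=(0, 5),
                 durations={2: 0.6, 3: 0.4}, qualities={1: 1.0})

# Agent 2's analysis of site C: its outcome distributions depend on
# whether agent 1's preparation finished first (the nonlocal effect).
def analyze_c(site_prepared: bool) -> Task:
    if site_prepared:  # prepared site: faster, more valuable in expectation
        return Task("analyze C", (0, 10), {2: 0.7, 4: 0.3}, {3: 0.8, 1: 0.2})
    return Task("analyze C", (0, 10), {4: 0.7, 6: 0.3}, {1: 0.8, 0: 0.2})
```

The structural point of the sketch is that agent 2's outcome distributions depend on agent 1's behavior only through the single shared feature site_prepared, which is the kind of nonlocal influence the framework abstracts and coordinates over.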


Related articles

Influence-Based Policy Abstraction for Weakly-Coupled Dec-POMDPs

Decentralized POMDPs are powerful theoretical models for coordinating agents' decisions in uncertain environments, but the generally intractable complexity of optimal joint policy construction presents a significant obstacle in applying Dec-POMDPs to problems where many agents face many policy choices. Here, we argue that when most agent choices are independent of other agents' choices, much of...
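As a rough illustration of the idea, an influence can be summarized far more compactly than a policy, for instance as the marginal probability (induced by the other agent's policy) that a shared feature becomes true by each time step. The sketch below, with made-up numbers and a hypothetical best_response_value helper, shows an agent evaluating its choices against such a summary rather than against the full policy.

```python
# Hedged sketch: an "influence" summarized as the marginal probability,
# induced by agent 1's policy, that the shared feature (site C prepared)
# has become true by each time step. Rewards are placeholders.
from typing import Dict

Influence = Dict[int, float]  # time step -> P(site prepared by t)

def best_response_value(influence: Influence, horizon: int) -> float:
    """Evaluate agent 2's option 'start analyzing at time t' against
    the influence summary instead of against agent 1's full policy."""
    best = float("-inf")
    for t in range(horizon):
        p = influence.get(t, 0.0)
        value = p * 3.0 + (1.0 - p) * 1.0  # 3 if prepared, 1 otherwise
        best = max(best, value)
    return best

# Two candidate influences agent 1 could commit to:
eager = {0: 0.0, 1: 0.0, 2: 0.6, 3: 1.0}  # prepare C immediately
never = {t: 0.0 for t in range(4)}         # skip site C entirely
print(best_response_value(eager, 4), best_response_value(never, 4))
```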


A POMDP Framework to Find Optimal Inspection and Maintenance Policies via Availability and Profit Maximization for Manufacturing Systems

Maintenance can be a factor in either increasing or decreasing a system's availability, so it is valuable to evaluate a maintenance policy from both cost and availability points of view, simultaneously and according to the decision maker's priorities. This study proposes a Partially Observable Markov Decision Process (POMDP) framework for a partially observable and stochastically deteriorating system...


Decentralized POMDPs

This chapter presents an overview of the decentralized POMDP (Dec-POMDP) framework. In a Dec-POMDP, a team of agents collaborates to maximize a global reward based on local information only. This means that agents do not observe a Markovian signal during execution, and therefore the agents' individual policies map from histories to actions. Searching for an optimal joint policy is an extremely hard...
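A small sketch of that representation, assuming a simple tree encoding of a history-to-action policy; the PolicyTree class and the observation names are illustrative, not from the chapter.

```python
# Hedged sketch: a Dec-POMDP agent's policy maps its local observation
# history (not a Markovian state) to an action, commonly represented
# as a tree indexed by the observations seen so far.
from typing import Dict, Optional, Tuple

History = Tuple[str, ...]  # sequence of local observations

class PolicyTree:
    def __init__(self, action: str,
                 children: Optional[Dict[str, "PolicyTree"]] = None):
        self.action = action            # action taken at this node
        self.children = children or {}  # next subtree per observation

    def act(self, history: History) -> str:
        """Follow the observed history down the tree, return the action."""
        node = self
        for obs in history:
            node = node.children[obs]
        return node.action

# Depth-2 policy: move toward site C, then act on what was sensed there.
policy = PolicyTree("goto-C", {
    "rock-seen": PolicyTree("analyze"),
    "nothing":   PolicyTree("recharge"),
})
print(policy.act(()))               # -> "goto-C"
print(policy.act(("rock-seen",)))   # -> "analyze"
```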


Informed Initial Policies for Learning in Dec-POMDPs

Decentralized partially observable Markov decision processes (Dec-POMDPs) offer a formal model for planning in cooperative multi-agent systems where agents operate with noisy sensors and actuators and local information. While many techniques have been developed for solving Dec-POMDPs exactly and approximately, they have been primarily centralized and reliant on knowledge of the model parameters...


Periodic Finite State Controllers for Efficient POMDP and DEC-POMDP Planning

Applications such as robot control and wireless communication require planning under uncertainty. Partially observable Markov decision processes (POMDPs) plan policies for single agents under uncertainty, and their decentralized versions (DEC-POMDPs) find a policy for multiple agents. The policy in infinite-horizon POMDP and DEC-POMDP problems has been represented as finite state controllers (FSCs)...
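A minimal sketch of an FSC as a data structure follows: finitely many nodes, an action per node, and a successor node per observation, so execution needs no unbounded history. The two-node controller cycles between its layers to suggest the periodic structure; the FSC class and the specific actions and observations are illustrative assumptions, not the paper's construction.

```python
# Hedged sketch of a finite state controller (FSC) policy for an
# infinite-horizon (Dec-)POMDP. A *periodic* FSC arranges nodes in
# layers and cycles through them; here, period 2.
from typing import Dict, List

class FSC:
    def __init__(self, actions: List[str],
                 transitions: List[Dict[str, int]], start: int = 0):
        self.actions = actions          # actions[q] = action at node q
        self.transitions = transitions  # transitions[q][obs] = next node
        self.node = start

    def step(self, observation: str) -> str:
        """Advance the controller on an observation, return its action."""
        self.node = self.transitions[self.node][observation]
        return self.actions[self.node]

# Two-node controller alternating sense/act, i.e. period 2:
fsc = FSC(actions=["sense", "act"],
          transitions=[{"ok": 1, "fail": 1},   # layer 0 -> layer 1
                       {"ok": 0, "fail": 0}])  # layer 1 -> layer 0
print(fsc.actions[fsc.node], fsc.step("ok"), fsc.step("fail"))
```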




Publication year: 2010